desi_use_reservation for prods #2346
Conversation
Thanks for this incredibly useful script. I have added a few comments inline. One notes something we discussed in person that should be documented for the future; the other two request very minor changes to make the code more robust. If this merge is time-critical we can proceed without the corrections, since the script will be fine >99% of the time, but the additional robustness would be nice to have.
#- which regular queue partition is eligible for this reservation?
regular_partition = resinfo['partition']

#- Determine CPU vs. GPU reservation
As discussed in person, this should be improved in the future to "future proof" it for later systems. For now it works fine on Perlmutter, and since we don't yet know how things may change for NERSC 10, it is fine to leave as-is.
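As a rough illustration of the kind of check being discussed (a hypothetical heuristic, not the actual implementation in the script):

```python
# Hypothetical sketch: infer CPU vs. GPU from the reservation's partition
# name. This is the Perlmutter-specific assumption that the review suggests
# future-proofing for later systems such as NERSC 10.
def reservation_uses_gpu(resinfo):
    """Guess whether a reservation is on GPU nodes from its partition name."""
    return 'gpu' in resinfo['partition'].lower()
```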
@akremin thanks for the comments. I addressed your two comments that needed updates; please re-review. In the meantime I also added logic to try to prevent a major imbalance of pending tilenight/ztile vs. flat jobs. In the end I'm not actually sure that helps our current situation, where a backlog of GPU jobs prevents us from submitting more CPU jobs, because either way we have to run those GPU jobs (flat, tilenight, or ztile). Thoughts?
I see the additions for job balancing, but I don't see either of the requested changes. Is it possible you forgot to push? As we discussed earlier, the load balancing isn't a bad thing, since it allows us to complete earlier nights before moving on to processing flats + science exposures on later nights. All GPU jobs need to run at some point for all nights to be complete, so this won't change the total time to finish the production. From a human perspective, though, it is nice to complete nights before moving on in a depth-first strategy, so I like this change. I might even advocate for 10x instead of 20x: we have 12 flats in a night and ~10-30 tilenights, so a factor of 20 is a major imbalance.
Missing changes pushed. I kept the imbalance-checking code, but left it only correcting the imbalance when it gets far out of whack, with a factor of 20. The original reason for wanting the flats processed sooner is that they unblock the nightlyflat, which then unblocks the science tiles, making them eligible to run in the regular queue even without a reservation. Running an equivalent number of tilenight/ztile jobs doesn't unblock as much.
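For concreteness, a minimal sketch of the imbalance guard being discussed; the function and field names are hypothetical rather than the actual code, and only the factor-of-20 threshold comes from the discussion above.

```python
# Hypothetical sketch of the pending-job imbalance guard discussed above.
# Field and function names are illustrative only.
IMBALANCE_FACTOR = 20  # could be lowered to 10 per the review discussion

def allow_more_tile_jobs(pending_jobs):
    """Return True if more tilenight/ztile jobs may be moved into the
    reservation without creating a major imbalance vs. pending flats."""
    n_tile = sum(1 for job in pending_jobs if job['JOBTYPE'] in ('tilenight', 'ztile'))
    n_flat = sum(1 for job in pending_jobs if job['JOBTYPE'] == 'flat')
    if n_flat == 0:
        # nothing left to balance against, so no reason to hold back
        return True
    return n_tile < IMBALANCE_FACTOR * n_flat
```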
Thanks, the changes look good and my requested changes have been implemented.
This PR adds a new script `desi_use_reservation` to assist in moving jobs from the regular queue into a batch reservation while running productions. It moves "just enough" jobs to fill the reservation, then pauses so that we don't overfill it. The script auto-derives the size of the reservation, whether it is a CPU or GPU partition, and which regular-queue jobs are eligible to run. It prioritizes "bottleneck" jobs like `ccdcalib`, `nightlyflat`, and `psfnight` over regular jobs like `arc`, `flat`, `tilenight`, and `ztile`. It only moves jobs into the reservation if they are not waiting on a dependency, so that we don't fill the reservation backlog with jobs that can't run anyway. As a consequence, it should be run fairly frequently so that it can pick up jobs that have become newly eligible.
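As a rough illustration of that selection logic (a hedged sketch only; the job-type names come from the description above, while the function and field names are hypothetical):

```python
# Hypothetical sketch of how eligible jobs might be chosen and prioritized.
# Field names (STATE, DEPENDENCY, JOBTYPE, NODES) are illustrative only.
BOTTLENECK_TYPES = ('ccdcalib', 'nightlyflat', 'psfnight')

def select_jobs_for_reservation(queued_jobs, nodes_available):
    """Pick pending, dependency-free jobs, bottleneck types first,
    until the requested number of nodes is filled."""
    eligible = [job for job in queued_jobs
                if job['STATE'] == 'PENDING' and not job['DEPENDENCY']]
    # Bottleneck jobs sort ahead of regular arc/flat/tilenight/ztile jobs
    eligible.sort(key=lambda job: job['JOBTYPE'] not in BOTTLENECK_TYPES)
    selected, nodes_used = [], 0
    for job in eligible:
        if nodes_used + job['NODES'] > nodes_available:
            continue  # too big to fit now; a smaller job later may still fit
        selected.append(job)
        nodes_used += job['NODES']
    return selected
```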
I have been testing and refining this today with the Kibo run using reservations kibo26_cpu and kibo26_gpu. I'm not done with ideas for additional improvements, but I'll cut myself off from "one more thing" and get this PR out for review. Note: it is safe to run this in a different environment from the production itself, since it is just moving jobs around, not actually submitting jobs that need the right environment.
I updated `desispec.workflow.queue.get_jobs_in_queue` to include a RESERVATION column. Otherwise the functionality is currently contained inside `bin/desi_use_reservation`, though pieces are broken out into functions that could be moved into `desispec.workflow.queue` and `desispec.scripts` as needed.
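As a hedged illustration of how that column can be used (the exact signature and return type of `get_jobs_in_queue` may differ; the reservation name is just the example from this production):

```python
# Hedged sketch: split queued jobs by whether they are already assigned to
# the reservation, using the new RESERVATION column. The no-argument call
# and table-style indexing are assumptions, not the documented API.
from desispec.workflow.queue import get_jobs_in_queue

jobs = get_jobs_in_queue()
in_reservation = [job for job in jobs if job['RESERVATION'] == 'kibo26_gpu']
candidates = [job for job in jobs if not job['RESERVATION']]
```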
Example usage
Dry run, run once and exit
Check the status of the reservation and recommend what to do, but don't actually do anything.
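A minimal sketch of what this dry-run mode does (the helper names here are hypothetical and reuse the sketches above; they are not the script's actual functions or command-line options):

```python
# Hypothetical sketch of the dry-run mode: report what would be moved into
# the reservation, but do not actually modify any jobs.
def dry_run(reservation):
    resinfo = get_reservation_info(reservation)          # hypothetical helper
    jobs = select_jobs_for_reservation(get_jobs_in_queue(),
                                       resinfo['free_nodes'])
    for job in jobs:
        print(f"Would move job {job['JOBID']} ({job['JOBTYPE']}) "
              f"into reservation {reservation}")
```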
Update reservation in a loop
Actually move jobs from the regular queue into the reservation, entering a loop that checks every 5 minutes until 2024-08-26T17:30. Include 10 extra nodes' worth of jobs so that there is a little buffer for jobs finishing and new ones starting before the next check.
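A minimal sketch of that looping behavior (again with hypothetical helper names and option handling; this is not the script's actual CLI):

```python
# Hypothetical sketch of the looping mode: top up the reservation, then
# sleep and repeat until the requested end time.
import time
from datetime import datetime

def run_loop(reservation, until='2024-08-26T17:30', extra_nodes=10, sleep_minutes=5):
    end_time = datetime.fromisoformat(until)
    while datetime.now() < end_time:
        resinfo = get_reservation_info(reservation)      # hypothetical helper
        # Fill the free nodes plus a small buffer of extra nodes so that new
        # jobs can start as running ones finish before the next check.
        target_nodes = resinfo['free_nodes'] + extra_nodes
        for job in select_jobs_for_reservation(get_jobs_in_queue(), target_nodes):
            move_job_to_reservation(job, reservation)    # hypothetical helper
        time.sleep(sleep_minutes * 60)
```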